A MONOTONE ISOMORPHISM THEOREM


TERRY SOO 

Dedicated to Professor Andres del Junco, September 21, 1948 - June 17, 2015 

Abstract. In the simple case of a Bernoulli shift on two symbols, 
zero and one, by permuting the symbols, it is obvious that any two 
equal entropy shifts are isomorphic. We show that the isomorphism
can be realized by a factor that maps a binary sequence to
another that is coordinatewise smaller than or equal to the original 
sequence. 


1. Introduction 

Let $N$ be a positive integer, $[N] = \{0, 1, \ldots, N-1\}$, and
$\Omega = [N]^{\mathbb{Z}}$. Let $T : \Omega \to \Omega$ be the left-shift given by
$(Tx)_i = x_{i+1}$ for all $i \in \mathbb{Z}$. Given a probability measure $p$ on
$[N]$, we call $B(p) = (\Omega, p^{\mathbb{Z}}, T)$ a Bernoulli shift on $N$ symbols.
We say that a Bernoulli shift $B(q)$ is a factor of $B(p)$ if there
exists a measurable map $\phi : \Omega \to \Omega$ such that the push-forward of
$p^{\mathbb{Z}}$ under $\phi$ is $q^{\mathbb{Z}}$ and $\phi \circ T = T \circ \phi$ on a subset of
$\Omega$ with $p^{\mathbb{Z}}$-full measure; we also call the map $\phi$ a factor
from $B(p)$ to $B(q)$. We say that the Bernoulli shifts $B(p)$ and $B(q)$
are isomorphic if there exists a factor map $\phi$ from $B(p)$ to $B(q)$
such that its inverse $\phi^{-1}$ serves as a factor map from $B(q)$ to
$B(p)$; in this case, we call $\phi$ an isomorphism of $B(p)$ and $B(q)$.
A factor map $\phi$ is monotone if for all $x \in \Omega$, we have
$\phi(x)_i \le x_i$ for all $i \in \mathbb{Z}$.

Theorem 1. If $p \in (\tfrac12, 1)$, then there exists a monotone isomorphism
of $B(1-p, p)$ and $B(p, 1-p)$.

Let us remark that the map defined by $\phi(x)_i = \mathbf{1}[x_i = 0]$ for all
$i \in \mathbb{Z}$, which just swaps zeros and ones, is clearly an isomorphism of
$B(1-p, p)$ and $B(p, 1-p)$. However, it is not monotone.

It is easy to determine when two Bernoulli shifts are isomorphic via
an invariant introduced by Kolmogorov [9], which is non-increasing
under factors and preserved under isomorphisms. The entropy of a
probability measure $p = (p_0, \ldots, p_{N-1})$ on $[N]$ is given by
$H(p) := -\sum_{i=0}^{N-1} p_i \log p_i$. Sinai [21, 22] proved that if
$H(p) \ge H(q)$, then $B(q)$ is a factor of $B(p)$, and Ornstein [16], [?]
proved that the entropies of two Bernoulli shifts are equal if and only
if the two Bernoulli shifts are isomorphic.
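As a small numerical illustration of the entropy invariant, the following sketch computes $H(p)$ with natural logarithms and checks that permuting the two symbols leaves it unchanged. The function name `entropy` and the sample value 0.7 are our own illustrative choices, not notation from the paper.

```python
import math

def entropy(p):
    """Shannon entropy H(p) = -sum_i p_i log p_i, with the convention 0 log 0 = 0."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

# Illustrative parameter (an assumption): any p in (1/2, 1) behaves the same way.
p = 0.7
h_forward = entropy([1 - p, p])   # entropy of the base measure of B(1-p, p)
h_swapped = entropy([p, 1 - p])   # entropy of the base measure of B(p, 1-p)
print(h_forward, h_swapped)
```

Since the two values agree, Ornstein's theorem applies to $B(1-p,p)$ and $B(p,1-p)$; the sketch says nothing about how an isomorphism is constructed.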

Although it is easy to compute the entropy of a Bernoulli shift and
to determine whether two Bernoulli shifts are isomorphic, the actual
factor map which realizes the isomorphism is in general a much more
complicated object. In some special cases, the factor map has a simple
description [15], [2]. The first non-trivial example of an isomorphism is
due to Meshalkin [15], which also gives a monotone isomorphism. I
thank Zemer Kosloff for his help with the following example.

Example 1 (A classical example due to Meshalkin [15]). We will
adjust the treatment given in [12] to ensure monotonicity. Let
$p = (\tfrac18, \tfrac18, \tfrac18, \tfrac18, \tfrac12)$ and $q = (\tfrac14, \tfrac14, \tfrac14, \tfrac14, 0)$, where $N = 5$, and $p$ and $q$
are probability measures on $[N] = \{0, 1, 2, 3, 4\}$. Let $x \in \Omega = [5]^{\mathbb{Z}}$.
We define a factor map $\phi : \Omega \to \Omega$ such that if $x_i = 4$, then
$\phi(x)_i \in \{2, 3\}$, if $x_i \in \{2, 3\}$, then $\phi(x)_i = 1$, and if
$x_i \in \{0, 1\}$, then $\phi(x)_i = 0$.

It remains to specify what happens when $x_i = 4$. Think of every
$x_j = 4$ as a right parenthesis, and think of every $x_j \ne 4$ as a left
parenthesis. Ergodicity implies that every parenthesis will be matched
legally almost surely. If $x_i = 4$, then let $j$ be the position of the
corresponding left parenthesis. If $x_j$ is odd, then we set $\phi(x)_i = 3$;
if $x_j$ is even, then we set $\phi(x)_i = 2$.

By definition, the map $\phi$ satisfies $\phi \circ T = T \circ \phi$ and is monotone.
Meshalkin proved that $\phi$ is an isomorphism of $B(p)$ and $B(q)$. ◊
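The parenthesis-matching rule of Example 1 can be sketched on a finite window of a sequence. This is only an illustration under our own assumptions: the function name `meshalkin_window` is hypothetical, and a symbol 4 whose matching left parenthesis lies outside the window is reported as `None`, since the true map is defined on bi-infinite sequences.

```python
def meshalkin_window(x):
    """Apply the rule of Example 1 to a finite list x over {0, 1, 2, 3, 4}.

    Symbols different from 4 act as left parentheses, the symbol 4 as a right
    parenthesis. Returns a list of the same length; an entry is None when a 4
    has no matching left parenthesis inside the window.
    """
    out = []
    stack = []  # positions of left parentheses that are not yet matched
    for i, s in enumerate(x):
        if s == 4:
            if stack:
                j = stack.pop()                        # position of the matching left parenthesis
                out.append(3 if x[j] % 2 == 1 else 2)  # odd symbol -> 3, even symbol -> 2
            else:
                out.append(None)                       # match lies outside the window
        else:
            stack.append(i)
            out.append(0 if s in (0, 1) else 1)        # {0,1} -> 0 and {2,3} -> 1
    return out

window = [0, 3, 4, 4, 2, 1, 4]
image = meshalkin_window(window)
# Monotonicity: every determined output symbol is at most the input symbol.
is_monotone = all(b is None or b <= a for a, b in zip(window, image))
```

The stack realizes the usual nearest-unmatched pairing of parentheses; the output symbol at a position depends on other coordinates only when the input symbol there is 4.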

It is easy to see that a necessary condition for the existence of a
monotone factor from $B(p)$ to $B(q)$ is that there exists a monotone
coupling of $p$ and $q$; that is, a probability measure $\rho$ on $[N] \times [N]$
such that $\rho(\cdot, [N]) = p$, $\rho([N], \cdot) = q$, and $\rho\{(n, m) : n \ge m\} = 1$.
By Strassen's theorem [23], the existence of a monotone coupling is
equivalent to the condition that $\sum_{i=0}^{k} p_i \le \sum_{i=0}^{k} q_i$ for all $0 \le k < N$,
in which case we say that $p$ stochastically dominates $q$.

Theorem 2 (Quas and Soo [18]). Let $p$ and $q$ be probability measures
on $[N]$. If $p$ stochastically dominates $q$ and $H(p)$ is strictly greater
than $H(q)$, then there exists a monotone factor from $B(p)$ to $B(q)$.

Karen Ball [1] proved Theorem 2 in the case that the measure $q$ is
supported on two symbols. In both of those papers, a strict entropy
inequality is required. In this paper, we treat the case of equal entropy,
in the special case where there are only two symbols, zero and one, in
each of the Bernoulli shifts. The methods used to prove Theorem 1 can
also be adapted to produce monotone factors in other specific cases,
but we do not know the answer to the following question.

Question 1. Let $p$ and $q$ be probability measures on $[N]$ such that
$p$ stochastically dominates $q$ and $H(p) = H(q)$. Does there exist a
monotone factor from $B(p)$ to $B(q)$?

Russell Lyons [?] first posed the question of whether a monotone
factor exists between two Bernoulli shifts. The requirement of
monotonicity makes defining maps more difficult. In a related problem,
Gurel-Gurevich and Peled [6, Theorem 1.3] proved that for $p \in (\tfrac12, 1)$
there exists a monotone map $\phi : \{0,1\}^{\mathbb{N}} \to \{0,1\}^{\mathbb{N}}$ such that the
product measure $(p, 1-p)^{\mathbb{N}}$ is the push-forward of $(1-p, p)^{\mathbb{N}}$ under $\phi$;
however, their map is not equivariant; that is, it does not satisfy
$\phi \circ T = T \circ \phi$.

See [?], [17], [9] for more information on entropy and the isomorphism
problem in ergodic theory. See [18] and [13] for background on factors
in probability theory.

The proof of Theorem 1 will involve some of the methods of [18],
which in turn combines ideas from various treatments of the Ornstein
and Sinai factor theorems given by Keane and Smorodinsky [?],
Burton and Rothstein [?], del Junco [?], and Ball [1]. We briefly
summarize some of the main features and differences in their proofs.
Keane and Smorodinsky, and Ball employed a marker-filler method
and a version of Hall's marriage theorem (see Remark 9). Del Junco
also employed a marker-filler method, but he replaced the marriage
lemma with his star-coupling (see Section 4). These constructions are
explicit and they exhibit factor maps that are finitary, an almost sure
continuity property (see [20] for details). In a somewhat more abstract
approach, Burton and Rothstein proved that in a suitably defined metric
space, the set of all factors is a residual set, in the sense of the Baire
category theorem. This was the approach taken in [18], and will also
be the approach we take here.

Dedication 

I never had the pleasure of meeting Professor del Junco, but I wrote 
to him in December 2013 about Theorem 2 with a preprint of [18]. He
wrote back the same day saying he was glad that an old idea of his had 
found another application and that he always felt that the star-coupling 
was one of his best ideas. 

His coupling was a key feature in our proof of Theorem 2 and will
also be a star feature in the proof of Theorem 1.

2. Coupling and Stochastic domination

Strassen's theorem [23] holds in the much more general setting of a
partially ordered Polish space. The proof, even in the case of a finite
set, is non-trivial; see for example [?, Theorem 10.4]. However, in the
special case of real-valued random variables or random variables taking
values on a finite totally ordered set, the proof is easily obtained using
a simple coupling of random variables.


2.1. Quantiles. Let $X$ be a real-valued random variable, with cumulative
distribution function or law given by $F(z) = F_X(z) := P(X \le z)$
for all $z \in \mathbb{R}$. Define the generalized inverse of $F$ via
$F^{-1}(y) := \sup\{x \in \mathbb{R} : F(x) < y\}$. Let $U$ be uniformly distributed in
$[0,1]$, so that $F_U(z) = z$ for all $z \in [0,1]$. We call $F_X^{-1}(U)$ the
quantile representation of $X$. It is easy to see that the random
variable $F_X^{-1}(U)$ has the same law as $X$. When we define random variables
using the quantile representation, sometimes we will refer to the random
variable $U$ as the randomization; often $U$ will be chosen to be
independent of any previously defined random variables.

If $X$ and $Y$ are two real-valued random variables, we say that $X$
stochastically dominates $Y$ if $P(X \le z) \le P(Y \le z)$ for all $z \in \mathbb{R}$.
A coupling of $X$ and $Y$ is a pair of random variables $(X', Y')$ defined
on the same probability space such that $X'$ has the same law as $X$
and $Y'$ has the same law as $Y$. Let $U$ be uniformly distributed in
$[0,1]$. If we set $X' := F_X^{-1}(U)$ and $Y' := F_Y^{-1}(U)$, then the quantile
coupling of $X$ and $Y$ is given by $(X', Y')$. We say that the coupling
$(X', Y')$ is monotone if $X' \ge Y'$. Strassen's theorem implies that
$X$ stochastically dominates $Y$ if and only if there exists a monotone
coupling of $X$ and $Y$. Clearly, the existence of a monotone coupling
implies stochastic domination; on the other hand, it is easy to see that
the quantile coupling is monotone under the assumption of stochastic
domination.
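For finitely supported laws the quantile coupling is easy to experiment with. The sketch below is an illustration under our own assumptions: laws are given as lists of masses on $\{0, \ldots, m-1\}$, the helper names `quantile` and `dominates` are hypothetical, and the two sample laws are made up for the example.

```python
from itertools import accumulate

def quantile(pmf, u):
    """Generalized inverse sup{x : F(x) < u} for a pmf on {0, ..., m-1} and u in (0, 1].

    On a finite ordered set this is the smallest index k with F(k) >= u.
    """
    for k, c in enumerate(accumulate(pmf)):
        if c >= u:
            return k
    return len(pmf) - 1  # guard against floating-point rounding when u is close to 1

def dominates(px, py, tol=1e-12):
    """True if the law px stochastically dominates py, i.e. F_X <= F_Y pointwise."""
    return all(cx <= cy + tol for cx, cy in zip(accumulate(px), accumulate(py)))

# Hypothetical laws on {0, 1, 2}; px puts more mass on large values.
px = [0.1, 0.3, 0.6]
py = [0.5, 0.3, 0.2]

# Evaluate the quantile coupling (F_X^{-1}(u), F_Y^{-1}(u)) on a grid of u values.
pairs = [(quantile(px, k / 1000), quantile(py, k / 1000)) for k in range(1, 1001)]
coupling_is_monotone = all(a >= b for a, b in pairs)
```

Because both coordinates are driven by the same value of $u$, the comparison of the two distribution functions transfers directly to the pair of quantiles.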

Let us remark that stochastic domination and the quantile coupling
are also similarly defined in the case that the random variables take
values in a finite totally ordered space.

Lemma 3 (Strassen’s theorem via the quantile coupling). Let X and 
Y be real-valued random variables or random variables taking values in 
a finite totally ordered space. If (X', Y') is a quantile coupling of X 
and Y, then X' is almost surely greater than or equal to Y' if and only 
if X stochastically dominates Y. 

In Section 4 we will discuss an ingenious variation of the quantile
coupling due to del Junco [3, Section 4], which will be a key ingredient
in our proof of Theorem 1.

2.2. A simple application of Strassen's theorem. Lemma 3 will
be used to prove the following simple observation, which will serve as
the starting point in our proof of Theorem 1. For two binary sequences
$x$ and $y$ of the same length, we write $x \preceq y$ if and only if $x_i \le y_i$ for
all indices $i$. Thus the relation $\preceq$ defines a partial order on the set of
binary sequences with the same length. We write $x = 1^n 0^{\ell}$ to mean a
binary sequence of $n$ ones followed by $\ell$ zeros.

Lemma 4. Let $n \ge 1$ and $p \in (\tfrac12, 1)$. Let $X = (X_1, \ldots, X_n)$ and
$Y = (Y_1, \ldots, Y_n)$ be i.i.d. sequences of Bernoulli random variables
with parameters $p$ and $1-p$, respectively. Let $B_n$ be the set of size $n+1$
of all binary sequences $z$ of length $n$ of the form $z = 1^{n-i} 0^{i}$ for some
$i \in [0, n]$. Let $X^*$ and $Y^*$ be random variables that have the laws of $X$
and $Y$ conditioned to be in $B_n$, respectively. Then with respect to the
order $\preceq$ defined on binary sequences, $X^*$ stochastically dominates $Y^*$,
and there is a monotone coupling of $X^*$ and $Y^*$.

Note that in Lemma 4, although the set of all binary sequences of a
fixed length is only partially ordered by $\preceq$, the set $B_n$ is totally
ordered by $\preceq$. The set $B_n$ can also be described as the set of binary
sequences of length $n$ that do not have a zero followed by a one. We will
refer to $B_n$ as a filler set.
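The statement of Lemma 4 can be checked numerically for small $n$. In the sketch below, an element of $B_n$ is indexed by its number of leading ones, so that the order $\preceq$ on $B_n$ becomes the usual order on $\{0, \ldots, n\}$; the function names and the sample values $n = 5$, $p = 0.7$ are our own assumptions.

```python
from itertools import accumulate

def filler_laws(n, p):
    """Laws of X* and Y* on B_n, indexed by i = number of leading ones (0..n)."""
    wx = [p**i * (1 - p)**(n - i) for i in range(n + 1)]   # P(X = 1^i 0^(n-i))
    wy = [(1 - p)**i * p**(n - i) for i in range(n + 1)]   # P(Y = 1^i 0^(n-i))
    zx, zy = sum(wx), sum(wy)                              # P(X in B_n), P(Y in B_n)
    return [w / zx for w in wx], [w / zy for w in wy], zx, zy

def dominates(px, py, tol=1e-12):
    """True if px stochastically dominates py: the CDF of px lies below that of py."""
    return all(cx <= cy + tol for cx, cy in zip(accumulate(px), accumulate(py)))

law_x, law_y, zx, zy = filler_laws(5, 0.7)
x_star_dominates = dominates(law_x, law_y)
```

The two normalizing constants agree because the weights of $Y$ are those of $X$ listed in reverse order, which is the duality between $p$ and $1-p$ used in the proof.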

Proof of Lemma 4. Lemma 4 is a simple consequence of the duality
between $p$ and $1-p$. For every integer $\ell \in [0, n]$, we have

$$\sum_{i=0}^{n-\ell} p^{n-\ell-i}(1-p)^{i} = \sum_{i=0}^{n-\ell} p^{i}(1-p)^{n-\ell-i}; \tag{1}$$

this implies, using $\ell = 0$, that $P(X \in B_n) = P(Y \in B_n)$. Thus in order
to prove that $X^*$ stochastically dominates $Y^*$, it suffices to show that
for all $z \in B_n$, we have $P(X \preceq z, X \in B_n) \le P(Y \preceq z, Y \in B_n)$. If
$z = 1^{n-\ell} 0^{\ell}$, then since $p > 1-p$, by equality (1) we have

$$P(X \preceq z, X \in B_n) = (1-p)^{\ell} \Big( \sum_{i=0}^{n-\ell} p^{n-\ell-i}(1-p)^{i} \Big)
\le p^{\ell} \Big( \sum_{i=0}^{n-\ell} p^{i}(1-p)^{n-\ell-i} \Big)
= P(Y \preceq z, Y \in B_n).$$

The existence of a monotone coupling follows from Lemma 3. □


3. Markers, fillers, and joinings 


3.1. Markers. Let us fix $\Omega = \{0,1\}^{\mathbb{Z}}$. Let $x \in \Omega$. We call the interval
$[i, i+1] \subset \mathbb{Z}$ a primary marker if $x_i = 0$ and $x_{i+1} = 1$. Later, we will
define secondary and tertiary markers which will consist of consecutive
primary markers. Note that two distinct primary markers have an
empty intersection. We call an interval of $\mathbb{Z}$ a filler if it is nonempty
and lies between two primary markers. Thus each $x \in \Omega$ partitions $\mathbb{Z}$
into intervals of primary markers and fillers.
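The decomposition into primary markers and fillers can be sketched on a finite window of a binary sequence. The function name and the sample window are hypothetical, and only fillers enclosed by two markers of the window are reported, since the stretches before the first and after the last marker may extend beyond the window.

```python
def markers_and_fillers(x):
    """Split a finite 0/1 list into primary markers and the fillers between them.

    Returns (markers, fillers), each a list of inclusive index pairs (start, end).
    A primary marker is a pair (i, i+1) with x[i] == 0 and x[i+1] == 1.
    """
    markers = [(i, i + 1) for i in range(len(x) - 1) if x[i] == 0 and x[i + 1] == 1]
    fillers = []
    for (_, b), (c, _) in zip(markers, markers[1:]):
        if c > b + 1:                       # nonempty gap between consecutive markers
            fillers.append((b + 1, c - 1))
    return markers, fillers

window = [0, 1, 1, 1, 0, 0, 0, 1, 0, 1, 1, 0, 1]
markers, fillers = markers_and_fillers(window)

# Each filler contains no zero followed by a one, so it consists of ones followed by zeros.
fillers_in_filler_set = all(
    not any(window[k] == 0 and window[k + 1] == 1 for k in range(a, b))
    for a, b in fillers
)
```

A filler cannot contain the pattern 01, because such a pattern would itself be a primary marker; this is why the contents of fillers always lie in a filler set $B_n$.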

Let $p \in (\tfrac12, 1)$, and consider the product probability measures on
$\Omega = \{0,1\}^{\mathbb{Z}}$ given by $\mu := (1-p, p)^{\mathbb{Z}}$ and $\nu := (p, 1-p)^{\mathbb{Z}}$. Thus
the probability that the zeroth coordinate is a one under $\mu$ is $p$ and
is $1-p$ under $\nu$. By conditioning, an instance of a random variable
$X$ with law $\mu$ can be given by first deciding on the locations of the
primary markers, and then deciding on the content of the fillers; the
same observation holds for a random variable $Y$ with law $\nu$.

To be more precise, let $\mathcal{T} = \{M, F\}^{\mathbb{Z}}$, where $M$ and $F$ are two symbols
that stand for 'marker' and 'filler.' For each $x \in \Omega$, define the hat map
by setting

$$\hat{x}(i) = \begin{cases} M & \text{if } i \in \mathbb{Z} \text{ is in a primary marker;} \\ F & \text{otherwise.} \end{cases}$$

Let $\tau$ and $\tau'$ be the push-forwards of the measures $\mu$ and $\nu$ via the hat
map. Sometimes we will refer to $\tau$ as the marker measure. We have
the following disintegration. For $\tau$-almost every $t \in \mathcal{T}$, there exists a
probability measure $\mu_t$ on $\Omega$ such that

$$\int_{\Omega} f(x) \, d\mu(x) = \int_{\mathcal{T}} \Big( \int_{\Omega} f(x) \, d\mu_t(x) \Big) d\tau(t)$$

for all measurable $f : \Omega \to [0, \infty)$.

Remark 5. Keane and Smorodinsky [?, Lemma 4] give a concrete
description of $\mu_t$. The measure $\mu_t$ assigns the sequence 01 to each
primary marker interval of $t$, and is a product measure on the filler
intervals, where on a filler interval of length $n$ it is the law of $n$ i.i.d.
Bernoulli random variables with parameter $p$ conditioned to be in the
set $B_n$ of sequences of consecutive ones followed by consecutive zeros
(see Lemma 4). The analogous result holds for $\nu$. ◊


Remark 6. Notice that the probability that the origin is contained in
a primary marker is the same under $\mu$ and $\nu$. Keane and Smorodinsky [10,
Lemma 3] proved that $\tau = \tau'$. Thus the marker measure $\tau$ is the same
for $\mu$ and $\nu$ and depends only on the parameter $p$. This fact will also
be important in our proof of Theorem 1. ◊

3.2. Joinings. A coupling of $\mu$ and $\nu$ is a probability measure $\xi$ on
$\Omega \times \Omega$ that has marginals $\mu$ and $\nu$; a joining is a coupling that is
invariant under the product shift $T \times T$, so that $\xi \circ (T \times T) = \xi$. A
joining $\xi$ is ergodic if all $\xi$-almost sure $(T \times T)$-invariant sets have
measure zero or one. A coupling $\xi$ is monotone if

$$\xi \{ (x, y) \in \Omega \times \Omega : x_i \ge y_i \text{ for all } i \in \mathbb{Z} \} = 1.$$

A joining $\xi$ is of marker form if for $\xi$-almost every $(x, y) \in \Omega \times \Omega$
the binary sequences $x$ and $y$ have the same primary markers. It follows
from Remark 6 that there exists a joining of $\mu$ and $\nu$ in marker form.
We will use a monotone version of this fact.

Proposition 7. There exists a monotone joining of $\mu$ and $\nu$ of marker
form.

Proof. By Remark 6, we have $\tau = \tau'$. Hence we may assume that there
exist random variables $X$ and $Y$ with laws $\mu$ and $\nu$ such that $X$ and $Y$
have the same primary markers and filler intervals. Consider a coupling
of $X$ and $Y$ defined in the following way. By Remark 5, conditioned
on the locations of the primary markers, for each filler interval $I$ of $X$,
we know that the law of the restriction of $X$ to $I$ is given by the law of
a finite sequence of i.i.d. Bernoulli random variables with parameter $p$
conditioned to be in a filler set; furthermore, conditioned on the locations
of primary markers, the restrictions of $X$ to each filler interval give
independent random variables. The analogous statement holds for $Y$.
For each filler interval $I$, by Lemma 4 there exists a monotone coupling
of the restriction of $X$ to $I$ and the restriction of $Y$ to $I$. Hence by
applying Lemma 4 to each of the filler intervals independently, and
leaving the primary markers alone, we obtain a coupling $(X', Y')$ of $X$
and $Y$ whose law is monotone and of marker form. □

3.3. The Baire category approach of Burton and Rothstein.

Let $p \in (\tfrac12, 1)$ and $J = J(p)$ be the set of all monotone ergodic joinings
of $\mu = (1-p, p)^{\mathbb{Z}}$ and $\nu = (p, 1-p)^{\mathbb{Z}}$ of marker form. Note that $J$
is nonempty by Proposition 7. Following the approach of Burton and
Rothstein [?], we will show the monotone isomorphisms are a residual
set in $J$, when we endow $J$ with a suitable topology. Following del
Junco [8], we assign a complete metric to $J$ as follows. For $i \ge 0$,
let $\mathcal{C}_i$ be the set of measurable $C \subset \Omega \times \Omega$ that only depend on the
coordinates $j \in [-i, i]$; we will call such sets cylinder sets. For any
two measures $\zeta$ and $\xi$ on $\Omega \times \Omega$ (which may not be joinings), set

$$d^*(\zeta, \xi) := \sum_{i=0}^{\infty} 2^{-(i+1)} \sup_{C \in \mathcal{C}_i} |\zeta(C) - \xi(C)|.$$

Thus $d^*$ is the usual weak-star metric. For $\xi \in J$, let $\xi_t$ be $\xi$ conditioned
to have the primary markers given by $t \in \mathcal{T}$. Let us remark that $\xi_t$ is no
longer a joining. Let $\tau$ be the common marker measure. For $\xi, \zeta \in J$,
set

$$d(\xi, \zeta) := \int d^*(\xi_t, \zeta_t) \, d\tau(t). \tag{2}$$

Standard methods show that $(J, d)$ is a Baire space (see for example
[18, Lemma 17]). We will show that the set of monotone isomorphisms
contains an intersection of open dense sets of $J$, and hence is nonempty
by the Baire category theorem. To be more precise, let $\mathcal{F}$ denote the
product sigma-algebra for $\Omega$. Let $\mathcal{P} := \{P_0, P_1\}$ denote the partition of
$\Omega$ according to the zeroth coordinate so that $P_i := \{x \in \Omega : x_0 = i\}$. Let
$\zeta \in J$ and let $\varepsilon > 0$. If there exists $\mathcal{T}' \subset \mathcal{T}$ with $\tau(\mathcal{T}') > 1 - \varepsilon$ such that
for every $t \in \mathcal{T}'$, and each $P \in \sigma(\mathcal{P})$ there exists a $P' \in \mathcal{F}$ such that

$$\zeta_t \big( (P' \times \Omega) \,\triangle\, (\Omega \times P) \big) < \varepsilon,$$

then we say that $\zeta$ is an $\varepsilon$-almost factor from $B(1-p, p)$ to $B(p, 1-p)$.
For each $\varepsilon > 0$, let $U_\varepsilon$ be the set of all $\varepsilon$-almost factors from $B(1-p, p)$
to $B(p, 1-p)$. It is routine to verify that $U_\varepsilon$ is an open set (see for
example [?, page 123-24]) and that an element in the intersection of all
the $U_\varepsilon$ defines a monotone factor from $B(1-p, p)$ to $B(p, 1-p)$ (see
for example [19, Theorem 2.8]). The real work lies in verifying that $U_\varepsilon$
is dense; once this has been proved, the Baire category theorem gives
that the set of monotone factors from $B(1-p, p)$ to $B(p, 1-p)$ contains
an intersection of open dense sets, and hence is nonempty.

Theorem 1 asserts the existence of a monotone isomorphism, which
appears to be a much stronger statement than the existence of a monotone
factor. However, one of the advantages of the Baire category approach
is that proving the existence of the isomorphism requires little
additional work. We define an approximate factor from $B(p, 1-p)$ to
$B(1-p, p)$ in the analogous way. Let $\zeta \in J$ and let $\varepsilon > 0$. If there
exists $\mathcal{T}' \subset \mathcal{T}$ with $\tau(\mathcal{T}') > 1 - \varepsilon$ such that for every $t \in \mathcal{T}'$, and each
$P \in \sigma(\mathcal{P})$ there exists a $P' \in \mathcal{F}$ such that $\zeta_t((P \times \Omega) \,\triangle\, (\Omega \times P')) < \varepsilon$,
then we say that $\zeta$ is an $\varepsilon$-almost factor from $B(p, 1-p)$ to $B(1-p, p)$.
For each $\varepsilon > 0$, let $V_\varepsilon$ be the set of all such $\varepsilon$-almost factors. Again, one
can verify that $V_\varepsilon$ is an open set, and that an element in the intersection
of all the $V_\varepsilon$ defines a monotone factor from $B(p, 1-p)$ to $B(1-p, p)$.
Moreover, any element in the grand intersection of all the $U_\varepsilon$ and $V_\varepsilon$
defines a monotone isomorphism. It will become apparent that the
same proof that shows that $U_\varepsilon$ is dense can be essentially copied to
show that $V_\varepsilon$ is dense. Thus the Baire category theorem shows that
the grand intersection is nonempty.

It remains to verify that for each $\varepsilon > 0$, the set $U_\varepsilon$ of $\varepsilon$-almost
factors is dense. Given $\varepsilon > 0$ and $\xi \in J$, we need to find $\xi' \in U_\varepsilon$
with $d(\xi, \xi') < \varepsilon$. We will define $\xi'$ as a certain perturbation of $\xi$
which will be obtained using del Junco's star-coupling [3, Section 4
and Proposition 4.7].


4. The star-coupling 

Let $X$ and $Y$ be random variables taking values on finite sets $A$ and
$B$, respectively. In this section, we will discuss various couplings of
$X$ and $Y$; that is, random variables $X'$ and $Y'$ defined together on
the same probability space with the same distribution as $X$ and $Y$,
respectively.

Let $\rho$ be a joint probability mass function for $X$ and $Y$. We say that
an element $a \in A$ is split by $\rho$ if there exist distinct $b, b' \in B$ such
that $\rho(a, b) > 0$ and $\rho(a, b') > 0$. For the purposes of defining factors,
we are interested in couplings that do not split many elements.

Remark 8. Let us remark that if we assign an arbitrary total ordering
to $A$ and $B$, then the law of a quantile coupling of $X$ and $Y$ will split
at most $|B| - 1$ elements of $A$. ◊
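The bound in Remark 8 can be observed directly by computing the joint law of a quantile coupling. In the sketch below the joint mass of a pair is the length of the overlap of the two corresponding quantile intervals in $[0,1]$; exact rational arithmetic is used to avoid rounding at interval endpoints. The function names and the two sample laws are our own assumptions.

```python
from fractions import Fraction
from itertools import accumulate

def quantile_coupling_pmf(pa, pb):
    """Joint pmf of (F_A^{-1}(U), F_B^{-1}(U)) for U uniform on [0, 1].

    The pair (i, j) receives the length of the overlap of the quantile
    intervals (ca[i], ca[i+1]] and (cb[j], cb[j+1]].
    """
    ca = [Fraction(0)] + list(accumulate(pa))
    cb = [Fraction(0)] + list(accumulate(pb))
    return [[max(Fraction(0), min(ca[i + 1], cb[j + 1]) - max(ca[i], cb[j]))
             for j in range(len(pb))] for i in range(len(pa))]

def split_elements(joint):
    """Indices a such that the joint law charges (a, b) and (a, b') for some b != b'."""
    return [i for i, row in enumerate(joint) if sum(m > 0 for m in row) >= 2]

# Hypothetical laws: uniform on five points for A, and (1/2, 3/10, 1/5) for B.
pa = [Fraction(1, 5)] * 5
pb = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]
joint = quantile_coupling_pmf(pa, pb)
split = split_elements(joint)
```

An element of $A$ is split exactly when its quantile interval contains one of the $|B| - 1$ interior cut points of the partition induced by $B$ in its interior, which is where the bound comes from.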

Remark 9. Keane and Smorodinsky [10, Theorem 11] proved that
there is a coupling of $X$ and $Y$ with law $\rho'$ that will split at most
$|B| - 1$ elements of $A$ and in addition, $\rho'$ is absolutely continuous with
respect to $\rho$; that is, $\rho(a, b) = 0$ implies $\rho'(a, b) = 0$. A version of their
theorem was used in the proof of Theorem 2, but we will not need to
appeal to this result in our proof of Theorem 1. ◊

Let $X$ and $Y$ be jointly distributed random variables taking values on
totally ordered finite sets $(A, \le)$ and $(B, \le)$, respectively. Let $X'$ have
the same law as $X$. One way to generate another random variable $Y'$ so
that $(X', Y')$ has the same joint distribution as $(X, Y)$ is to appeal to
a quantile representation. Consider the set of conditional cumulative
distribution functions given by $Q_a(b) := P(Y \le b \mid X = a)$ for each
$a \in A$. Let $U$ be uniformly distributed in $[0,1]$ and independent of
$X'$. Set $Y' := Q_{X'}^{-1}(U)$. It is easy to verify that $(X', Y')$ has the same
joint distribution as $(X, Y)$; we call $(X', Y')$ the conditional quantile
representation of $(X, Y)$.

The next coupling we discuss is due to del Junco [3, 4]. Let $(X_1, Y_1)$
and $(X_2, Y_2)$ be random variables taking values on the finite sets $A_1 \times B_1$
and $A_2 \times B_2$, respectively. Suppose that each of the sets $A_1$, $A_2$, $B_1$, and
$B_2$ is a totally ordered set. We will define $(X_1', Y_1')$ and $(X_2', Y_2')$ such
that $(X_i', Y_i')$ has the same law as $(X_i, Y_i)$ for $i = 1, 2$. Let $U_1$, $U_2$, and
$U$ be independent random variables uniformly distributed in the unit
interval $[0,1]$. Let $X_2'$ and $Y_1'$ be independent random variables that
have the same laws as $X_2$ and $Y_1$, respectively; more specifically, we
may assume that they are given by their respective quantile
representations with sources of randomization given by $U_2$ and $U_1$. Next,
using the same source of randomization $U$, let $Y_2'$ be such that $(X_2', Y_2')$
is the conditional quantile representation of $(X_2, Y_2)$, and let $X_1'$ be such
that $(X_1', Y_1')$ is the conditional quantile representation of $(X_1, Y_1)$. We
refer to $((X_1', Y_1'), (X_2', Y_2'))$ as the star-coupling of $(X_1, Y_1)$ and $(X_2, Y_2)$.
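The construction above can be written as a deterministic function of the three sources of randomization. The sketch below follows our reading of the definition: $Y_1'$ and $X_2'$ come from independent uniforms $u_1$ and $u_2$, while $X_1'$ (given $Y_1'$) and $Y_2'$ (given $X_2'$) are both read off from the shared uniform $u$. The function names, the table layout `rho[a][b]`, and the sample tables are assumptions made for illustration.

```python
from itertools import accumulate

def quantile(weights, u):
    """Smallest index k whose normalized cumulative weight is >= u, for u in (0, 1]."""
    total = sum(weights)
    for k, c in enumerate(accumulate(weights)):
        if c >= u * total:
            return k
    return len(weights) - 1  # guard against floating-point rounding

def star_coupling(rho1, rho2, u1, u2, u):
    """One draw of the star-coupling, where rho_i[a][b] = P(X_i = a, Y_i = b).

    Returns ((x1, y1), (x2, y2)) as index pairs.
    """
    y1_marginal = [sum(row[b] for row in rho1) for b in range(len(rho1[0]))]
    x2_marginal = [sum(row) for row in rho2]
    y1 = quantile(y1_marginal, u1)                 # Y1' from its own randomization u1
    x2 = quantile(x2_marginal, u2)                 # X2' from its own randomization u2
    x1 = quantile([row[y1] for row in rho1], u)    # X1' given Y1' = y1, shared u
    y2 = quantile(rho2[x2], u)                     # Y2' given X2' = x2, shared u
    return (x1, y1), (x2, y2)

# Hypothetical joint laws on {0,1} x {0,1}.
rho1 = [[0.25, 0.25], [0.5, 0.0]]
rho2 = [[0.1, 0.4], [0.3, 0.2]]
draw = star_coupling(rho1, rho2, 0.9, 0.2, 0.5)
```

Feeding independent uniform samples for `u1`, `u2`, and `u` gives samples of the coupling; sharing `u` between the two conditional quantile steps is what ties the two pairs together.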

Remark 10. It is immediate from the definition of the star-coupling that
$X_2'$ is independent of $(X_1', Y_1')$ and $Y_1'$ is independent of $(X_2', Y_2')$. ◊

Remark 11. It follows from Remark 8 that the star-coupling of the
random variables $(X_1, Y_1)$ and $(X_2, Y_2)$ taking values on $A_1 \times B_1$ and
$A_2 \times B_2$, respectively, has the property that for a fixed $a_2 \in A_2$ and
$b_1 \in B_1$, the number of $a_1 \in A_1$ such that there are distinct $b_2, b_2' \in B_2$
with both $(a_1, b_1, a_2, b_2)$ and $(a_1, b_1, a_2, b_2')$ receiving positive mass under
the law of the star-coupling $(X_1', Y_1', X_2', Y_2')$ is at most $|B_2| - 1$. ◊

Remark 12. del Junco refers to his coupling as the $*$-joining [3, 4]. ◊

We may also iterate the star-coupling to more than two pairs of random
variables. For example, if $(X_i, Y_i)$ are finite-valued random variables
taking values in totally ordered spaces $A_i \times B_i$ for $i = 1, 2, 3$, we
define its iterated star-coupling in the following way. Take the
star-coupling of $(X_1, Y_1)$ and $(X_2, Y_2)$ to be given by $((X_1', Y_1'), (X_2', Y_2'))$.
Assign a lexicographic ordering to the set $A_1 \times A_2$, and take the
star-coupling of $((X_1', X_2'), (Y_1', Y_2'))$ and $(X_3, Y_3)$, to obtain random
variables $((X_1'', X_2'', X_3''), (Y_1'', Y_2'', Y_3''))$. Notice that by definition of the
star-coupling, $(X_i'', Y_i'')$ has the same law as $(X_i, Y_i)$ for $i = 1, 2, 3$.

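We will make use of the following variation of the iterated
star-coupling. Let $\rho$ be a probability measure on the finite set $A \times B$ which
has projections $\alpha$ and $\beta$ on the sets $A$ and $B$, respectively. For every
$k \in \mathbb{Z}^+$, let $\alpha^k$ and $\beta^k$ denote the $k$-fold product measures on $A^k$ and
$B^k$, respectively. Let $n_{\mathrm{initial}}$ and $k_{\mathrm{group}} \ge 2$ be integers. Let
$Z_i = (X_i, Y_i)$ be random variables with the following grouping property
with law $\rho$ and constants $n_{\mathrm{initial}}$ and $k_{\mathrm{group}}$. For each $i \ge 1$, the random
variable $Z_i$ takes values on $(A \times B)^{k_{\mathrm{group}}}$ and has a law that
has projections $\alpha^{k_{\mathrm{group}}}$ and $\beta^{k_{\mathrm{group}}}$ on $A^{k_{\mathrm{group}}}$ and $B^{k_{\mathrm{group}}}$, respectively,
and a projection $\rho$ on each copy of $A \times B$. Similarly, for $i = 0$, the
random variable $Z_0$ takes values on $(A \times B)^{n_{\mathrm{initial}}}$ and has a law that
has projections $\alpha^{n_{\mathrm{initial}}}$ and $\beta^{n_{\mathrm{initial}}}$ on $A^{n_{\mathrm{initial}}}$ and $B^{n_{\mathrm{initial}}}$, respectively,
and a projection $\rho$ on each copy of $A \times B$.

For $i \ge 1$, write $X_i = (X_i^1, \ldots, X_i^{k_{\mathrm{group}}})$, $Y_i = (Y_i^1, \ldots, Y_i^{k_{\mathrm{group}}})$,
$Z_i^j = (X_i^j, Y_i^j)$, $Y_i^{\mathrm{miss}} = (Y_i^1, \ldots, Y_i^{k_{\mathrm{group}}-1})$, and $Z_i^{\mathrm{miss}} = (X_i, Y_i^{\mathrm{miss}})$.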

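First, consider the following coupling of $X_0$, $X_1$, $Y_0$, and $Y_1$. The
resulting coupling will not be a coupling of $Z_0 = (X_0, Y_0)$ and
$Z_1 = (X_1, Y_1)$, but the resulting coupling as a measure on
$(A \times B)^{n_{\mathrm{initial}} + k_{\mathrm{group}}}$ will have $\rho$ as a projection on each copy of $A \times B$.
Let $W = (U, V)$ have law given by the product measure $\rho^{k_{\mathrm{group}}}$. Let
$(Z_0', (X_1', Y_1^{\mathrm{miss}}))$ be a star-coupling of $Z_0 = (X_0, Y_0)$ and $W^{\mathrm{miss}}$. Using
independent randomization, let $Y_1^{\mathrm{new}}$ be such that the pair
$((X_1')^{k_{\mathrm{group}}}, Y_1^{\mathrm{new}})$, where $(X_1')^{k_{\mathrm{group}}}$ denotes the last coordinate of $X_1'$, is
the conditional quantile representation of $W^{k_{\mathrm{group}}} = (U^{k_{\mathrm{group}}}, V^{k_{\mathrm{group}}})$. It is
easy to verify from the properties of the star-coupling and the
independence of $(W^1, \ldots, W^{k_{\mathrm{group}}})$ that $Y_1^{\mathrm{new}}$ is independent of
$(Y_0', Y_1^{\mathrm{miss}})$; moreover, $(Z_0', (X_1', (Y_1^{\mathrm{miss}}, Y_1^{\mathrm{new}})))$ is a coupling of $Z_0$ and
$W$ such that $(X_0', X_1')$ has law $\alpha^{n_{\mathrm{initial}} + k_{\mathrm{group}}}$ and $(Y_0', Y_1^{\mathrm{miss}}, Y_1^{\mathrm{new}})$ has law
$\beta^{n_{\mathrm{initial}} + k_{\mathrm{group}}}$. We will refer to this coupling as the star-coupling
with replacement of $Z_0$ and $Z_1$.

Here, two 'replacements' take place: $Z_1$ was replaced by $W = (U, V)$,
which has the product measure $\rho^{k_{\mathrm{group}}}$ as its law, and we only applied
the star-coupling to $Z_0$ and $W^{\mathrm{miss}}$, where in the final construction, the
'missing' value is replaced with a conditional quantile representation.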

We iterate this construction as follows. First, let $(Z_0', Z_1')$ be the
star-coupling with replacement of $Z_0$ and $Z_1$. Next, we take the
star-coupling with replacement of $((X_0', X_1'), (Y_0', Y_1'))$ and $Z_2$, to obtain
random variables $((X_0'', X_1'', X_2'), (Y_0'', Y_1'', Y_2^{\mathrm{miss}}, Y_2^{\mathrm{new}}))$ taking values on
the space $A^{n_{\mathrm{initial}} + 2k_{\mathrm{group}}} \times B^{n_{\mathrm{initial}} + 2k_{\mathrm{group}}}$ with a law that has projections
$\alpha^{n_{\mathrm{initial}} + 2k_{\mathrm{group}}}$ and $\beta^{n_{\mathrm{initial}} + 2k_{\mathrm{group}}}$, respectively. Finally, it is clear that
this construction can be extended an arbitrary number of times in the
obvious way. We call this construction the iterated star-coupling
with replacement of $Z_0, Z_1, \ldots, Z_n$.

The importance of the star-coupling can be summarized in
Proposition 13 below; it is a version of del Junco's [3, Proposition 4.7].

Proposition 13 (del Junco). Let $\rho$ be a probability measure on the
finite set $A \times B$ and have marginals $\alpha$ and $\beta$ on $A$ and $B$, respectively.
Assume that $H(\alpha) = H(\beta)$. Let $k_{\mathrm{group}} \ge 2$. For $\eta > 0$, there exists
$n_{\mathrm{initial}} = n_{\mathrm{initial}}(k_{\mathrm{group}})$ such that the following holds.

Let $n \in \mathbb{Z}^+$. Let $Z_i = (X_i, Y_i)$, for $i = 0, 1, \ldots, n$, have the grouping
property with the law $\rho$ and constants $n_{\mathrm{initial}}$ and $k_{\mathrm{group}}$. Define the
following product spaces

$$I_j := A^{n_{\mathrm{initial}}} \times A^{k_{\mathrm{group}} j} = A^{n_{\mathrm{initial}} + k_{\mathrm{group}} j},$$
$$J_j := B^{n_{\mathrm{initial}}} \times B^{k_{\mathrm{group}} j} = B^{n_{\mathrm{initial}} + k_{\mathrm{group}} j},$$

and

$$\bar{J}_j := B^{(k_{\mathrm{group}} - 1) j}.$$

For $\mathbf{y} = (y_0, (y_1, y_1'), \ldots, (y_j, y_j')) \in J_j = B^{n_{\mathrm{initial}}} \times
(B^{k_{\mathrm{group}}-1} \times B) \times \cdots \times (B^{k_{\mathrm{group}}-1} \times B)$, let

$$\bar{\mathbf{y}} = (y_1, \ldots, y_j) \in \bar{J}_j.$$

Let $\mathbf{W}_n = (\mathbf{X}_n, \mathbf{Y}_n)$ be a random variable given by the iterative
star-coupling with replacement of $Z_0, Z_1, \ldots, Z_n$. There exists a
deterministic function $\Psi : I_n \to \bar{J}_n$ such that $P(\bar{\mathbf{Y}}_n = \Psi(\mathbf{X}_n)) > 1 - \eta$.


The proof of Proposition 13 uses the Shannon-McMillan-Breiman
theorem and Remark 11. A version of Proposition 13 is also used in the
proof of Theorem 2 of Quas and Soo; see [18, Proposition 14].


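5. The Proof of Proposition 13

Proof of Proposition 13. We will place conditions on $n_{\mathrm{initial}}$ later. Let
$h := H(\alpha) = H(\beta)$. Let $\varepsilon > 0$ be such that

$$h - 2\varepsilon > \Big(1 - \frac{1}{k_{\mathrm{group}}}\Big)(h + \varepsilon). \tag{3}$$

Set

$$L_j := n_{\mathrm{initial}} + k_{\mathrm{group}}\, j \quad \text{for } 0 \le j \le n.$$

Let $\mathbf{x} \in I_j$ be given by $\mathbf{x} = (x_0, \ldots, x_j)$. We say that $\mathbf{x}$ is $\alpha$-good if

$$\alpha^{L_j}(\mathbf{x}) < e^{-(h - \varepsilon) L_j} \tag{4}$$

and is $\alpha$-completely good if for all $0 \le i \le j$, we have that
$(x_0, \ldots, x_i) \in I_i$ is good.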

The corresponding definition for $\beta$ is more complicated. We remark
that in the presence of a strict entropy gap, $H(\alpha) > H(\beta)$, the
definition could be simpler and more symmetric (see for example [?, page
366] or [18, Proof of Proposition 14]). We declare that every $\mathbf{y} \in J_0$ is
$\beta$-good. Set

$$\bar{L}_j := (k_{\mathrm{group}} - 1) j \quad \text{for } 0 < j \le n,$$

so that $L_j = \bar{L}_j + n_{\mathrm{initial}} + j$. We say that
$\mathbf{y} = (y_0, (y_1, y_1'), \ldots, (y_j, y_j')) \in J_j$ is $\beta$-good if

$$\beta^{\bar{L}_j}(\bar{\mathbf{y}}) > e^{-(h - 2\varepsilon) L_j}. \tag{5}$$

Note that being $\beta$-good does not depend on the behavior of the
coordinates $(y_0, y_1', \ldots, y_j')$ and $L_j$ appears in the exponent rather than $\bar{L}_j$
on the right hand side of (5). We say that $\mathbf{y}$ is $\beta$-completely good if
for all $0 \le i \le j$, we have that $\mathbf{y}_i = (y_0, (y_1, y_1'), \ldots, (y_i, y_i')) \in J_i$ is good.

Note that if $\mathbf{y} \in J_n$ is not completely good, then for some $j \ge 1$, we
have $\beta^{\bar{L}_j}(\bar{\mathbf{y}}_j) \le e^{-(h - 2\varepsilon) L_j}$, and by (3),

$$\beta^{\bar{L}_j}(\bar{\mathbf{y}}_j) < e^{-(h + \varepsilon) \bar{L}_j}. \tag{6}$$

For two elements $\mathbf{y}, \mathbf{z} \in J_j$, we say that they are equivalent if
$\bar{\mathbf{y}} = \bar{\mathbf{z}}$. We let $[\mathbf{y}] \subset J_j$ be the equivalence class of $\mathbf{y}$. Given a measure
on $I_j \times J_j$ we say it finely splits an element $\mathbf{x} \in I_j$ if there exist
$\mathbf{y}, \mathbf{z} \in J_j$ such that $[\mathbf{y}] \ne [\mathbf{z}]$ and for which the measure assigns positive
mass to both $(\mathbf{x}, \mathbf{y})$ and $(\mathbf{x}, \mathbf{z})$.

For $j \ge 0$, let $\mathbf{W}_j = (\mathbf{X}_j, \mathbf{Y}_j)$ be a random variable given by the
iterative star-coupling with replacement of $Z_0, Z_1, \ldots, Z_j$, where we set
$\mathbf{W}_0 := Z_0$; thus $(\mathbf{X}_j, \mathbf{Y}_j)$ takes values in $I_j \times J_j$. We say that $\mathbf{x} \in I_j$ is
desirable if the following properties are satisfied.

(a) The element $\mathbf{x}$ is $\alpha$-completely good.

(b) The element $\mathbf{x}$ is not finely split by (the law of) $\mathbf{W}_j = (\mathbf{X}_j, \mathbf{Y}_j)$.

(c) Furthermore, up to equivalence, there is a unique $\beta$-completely
good $\mathbf{y} \in J_j$ for which $(\mathbf{x}, \mathbf{y})$ receives positive mass under (the law
of) $\mathbf{W}_j$.

For desirable $\mathbf{x} \in I_j$, set $\Psi_j(\mathbf{x}) = \bar{\mathbf{y}}$, where $\mathbf{y}$ is determined by condition
(c); otherwise, if $\mathbf{x}$ is not desirable, simply set $\Psi_j(\mathbf{x}) = \bar{\mathbf{y}}'$ for some
predetermined fixed $\mathbf{y}' \in J_j$. Note that

$$P(\bar{\mathbf{Y}}_j = \Psi_j(\mathbf{X}_j)) \ge P(\mathbf{X}_j \text{ is desirable}).$$

Remark 11 and del Junco's inductive argument [3, Lemma 4.6] will
be used to show that for all $j \ge 0$,

$$P(\mathbf{X}_j \text{ is not desirable}) \le P(\mathbf{X}_j \text{ is not c.g.}) + P(\mathbf{Y}_j \text{ is not c.g.}) + |B|^{k_{\mathrm{group}}} \sum_{i=0}^{j} e^{-\varepsilon L_i}, \tag{7}$$

where "c.g." is short for completely good.

The case $j = 0$ is vacuous, since being good implies being completely
good, and under $Z_0$ no elements are finely split.

Assume (7) for the case $j - 1 \ge 0$. We show that (7) holds for the case
$j$. Let $E$ be the event that $\mathbf{X}_{j-1}$ is desirable, but $\mathbf{X}_j$ is not desirable.
Clearly,

$$P(\mathbf{X}_j \text{ is not desirable}) \le P(\mathbf{X}_{j-1} \text{ is not desirable}) + P(E). \tag{8}$$

Note that on the event $E$, the random variables $\mathbf{X}_{j-1}$ and $\mathbf{Y}_{j-1}$ are
completely good. Observe that the event $E$ is contained in the union of
the following three events.

(I) $E_1$ := The random variable $\mathbf{X}_j$ is not good, but $\mathbf{X}_{j-1}$ is
completely good.

(II) $E_2$ := The random variable $\mathbf{X}_j$ is completely good, but is finely
split under the iterative star-coupling $\mathbf{W}_j$, even though $\mathbf{X}_{j-1}$ is
desirable.

(III) $E_3$ := The random variable $\mathbf{Y}_j$ is not good, but $\mathbf{Y}_{j-1}$ is
completely good.

Clearly,

$$P(E_1) + P(\mathbf{X}_{j-1} \text{ is not c.g.}) = P(\mathbf{X}_j \text{ is not c.g.}). \tag{9}$$

Similarly,

$$P(E_3) + P(\mathbf{Y}_{j-1} \text{ is not c.g.}) = P(\mathbf{Y}_j \text{ is not c.g.}). \tag{10}$$

Let us focus on the event $E_2$. Let $\mathbf{X}_j = (\mathbf{X}_{j-1}, X)$, so that $X$ takes
values in $A^{k_{\mathrm{group}}}$. We show that for any $x \in A^{k_{\mathrm{group}}}$ and any completely
good $\mathbf{y} \in J_{j-1}$ that

$$P(E_2 \mid X = x, \bar{\mathbf{Y}}_{j-1} = \bar{\mathbf{y}}) \le [\ldots] \tag{11}$$

so that $P(E_2) \le [\ldots]$, and it follows that (7) holds by (8),
(9), (10), and the inductive hypothesis.

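Note that if $\mathbf{x}$ and $\mathbf{y}$ are good, then

$$P(\mathbf{X}_{j-1} = \mathbf{x} \mid X = x, \bar{\mathbf{Y}}_{j-1} = \bar{\mathbf{y}})
= \frac{P(\mathbf{X}_{j-1} = \mathbf{x}, \bar{\mathbf{Y}}_{j-1} = \bar{\mathbf{y}}, X = x)}{P(\bar{\mathbf{Y}}_{j-1} = \bar{\mathbf{y}}, X = x)}$$
$$= \frac{P(\mathbf{X}_{j-1} = \mathbf{x}, \bar{\mathbf{Y}}_{j-1} = \bar{\mathbf{y}})}{P(\bar{\mathbf{Y}}_{j-1} = \bar{\mathbf{y}})} \tag{12}$$
$$\le \frac{P(\mathbf{X}_{j-1} = \mathbf{x})}{P(\bar{\mathbf{Y}}_{j-1} = \bar{\mathbf{y}})}
< e^{-\varepsilon L_{j-1}}, \tag{13}$$

where (12) follows from the independence properties of the star-coupling
(with replacement) and (13) follows from (4) and (5). Also note that
if $\mathbf{x}$ is desirable, then if $(\mathbf{x}, x)$ is finely split under $\mathbf{W}_j$, then for the
unique, up to equivalence, $\mathbf{y}$ for which $(\mathbf{x}, \mathbf{y})$ receives positive mass
under $\mathbf{W}_{j-1}$ there exist $(y, u), (y', u') \in B^{k_{\mathrm{group}}-1} \times B$ such
that $y \ne y'$ and both $((\mathbf{x}, x), (\mathbf{y}, y, u))$ and $((\mathbf{x}, x), (\mathbf{y}, y', u'))$ receive
positive mass under $\mathbf{W}_j$. By Remark 11 and the definition of the
star-coupling with replacement, for a fixed $x \in A^{k_{\mathrm{group}}}$ and $\mathbf{y} \in J_{j-1}$, the set
of all $\mathbf{x}$ such that there exist distinct $y, y' \in B^{k_{\mathrm{group}}-1}$ for which there
are $u, u' \in B$ such that both $((\mathbf{x}, x), (\mathbf{y}, y, u))$ and $((\mathbf{x}, x), (\mathbf{y}, y', u'))$
receive positive mass under $\mathbf{W}_j$ has at most $|B|^{k_{\mathrm{group}}-1} - 1$ elements;
thus summing over all such $\mathbf{x}$ yields (11).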

The Shannon-McMillan-Breiman theorem implies that $n_{\mathrm{initial}}$ can be chosen so that all three terms in (7) can be made smaller than $\eta/3$. This is done in the following way. Set

$$S_A(k, K) := \{a \in A^K : \alpha^{\ell}(a) < \ \text{ for all } k \le \ell \le K\}$$

and

$$S_B(k, K) := \{b \in B^K : \beta^{\ell}(b) > \ \text{ for all } k \le \ell \le K\},$$

where we have the slight abuse of notation that if $a = (a_1, \ldots, a_K)$, then $\alpha^{\ell}(a) = \alpha^{\ell}(a_1, \ldots, a_{\ell})$.

First, by the Shannon-McMillan-Breiman theorem choose $k$ so that for all $K \ge k$, we have

$$\beta^K(S_B(k, K)) > 1 - \eta/3. \qquad (14)$$

Next, using the Shannon-McMillan-Breiman theorem again, choose $n_{\mathrm{initial}}$ sufficiently large so that the following three inequalities are satisfied:

$$\alpha^K(S_A(n_{\mathrm{initial}}, K)) > 1 - \eta/3 \quad \text{for all } K \ge n_{\mathrm{initial}}, \qquad (15)$$

$$\min\{\beta^{\ell}(y) > 0 : y \in B^{\ell},\ 0 < \ell \le k\} > \qquad (16)$$

and

$$|B|^{k_{\mathrm{group}}} \sum_{\ell = n_{\mathrm{initial}}}^{\infty} e^{-\varepsilon \ell} < \eta/3. \qquad (17)$$

Finally, we will verify that this choice of $n_{\mathrm{initial}}$ is sufficient. Condition (15) gives that $P(X_j \text{ is not c.g.}) < \eta/3$. Recall that by definition, $L_j \ge n_{\mathrm{initial}}$ for all $j \ge 0$, so that (17) ensures that the third term in (7) is less than $\eta/3$.

It remains to verify that $P(Y_j \text{ is not c.g.}) < \eta/3$.

The definition of completely good, (5), gives that if

$$y = (y_0, (y_1, y'_1), \ldots, (y_n, y'_n)) \in J_n$$

is not completely good, then for some $i \ge 0$, we have

(18)

where $y^i = (y_0, (y_1, y'_1), \ldots, (y_i, y'_i))$; inequalities (18) and (16) imply that

$$L_i > k; \qquad (19)$$

moreover, (18) gives that

(20)

Hence if $y$ is not completely good, then by (19) and (20) it belongs to the complement of $S_B(k, K)$ for all $K \ge k$. Thus (14) gives that

$$P(Y_j \text{ is not c.g.}) < \eta/3.$$

□

In Proposition 13 we have that given $x \in I_n$, with high probability, up to equivalence, it determines a corresponding

$$y = (y_0, (y_1, y'_1), \ldots, (y_n, y'_n)) \in J_n.$$

It will be useful to refer to $(y_0, y'_1, \ldots, y'_n)$ as the undetermined coordinates and $(y_1, \ldots, y_n)$ as the destined coordinates. We say that there are $n_{\mathrm{initial}} + n$ undetermined coordinates, since $y_0 \in B^{n_{\mathrm{initial}}}$ and $y'_i \in B$ for $1 \le i \le n$, and there are $(k_{\mathrm{group}} - 1)n$ destined coordinates.

6. Perturbing the joining

Let $\zeta \in J$ be a monotone joining of marker form. We will define a perturbation $\zeta'$ of $\zeta$ using the iterated star-coupling with replacement. The perturbation will depend on a few parameters. With the help of Proposition 13 we will be able to make a choice of these parameters so that $\zeta'$ will be an almost factor and close to $\zeta$ in the metric defined in (2).

6.1. Defining the perturbation. Let $k_{\mathrm{mark}} < r_{\mathrm{mark}}$ be large integers to be chosen later. A secondary marker is the maximal union of at least $k_{\mathrm{mark}}$ consecutive primary markers, so that secondary markers have no filler between them and if the interval $[i, j]$ is a secondary marker for $x \in \Omega$, then $x$ restricted to $[i, j]$ has the form $0101 \cdots 0101$.

Similarly, a tertiary marker is the maximal union of at least $r_{\mathrm{mark}}$ consecutive primary markers. We call the set of integers between but not including two secondary markers a block, and the set of integers between but not including two tertiary markers a city. Thus within a city there are blocks, which we consider ordered from left to right. Note that a block may contain primary markers.

Let $p \in (1/2, 1)$. Let $\zeta \in J(p)$. Let $Z = (X, Y)$ have law $\zeta$. Suppose that we are given that $Z$ has primary markers given by $t \in T$. Let $I \subset \mathbb{Z}$ be a block of length $n$. We are interested in the distribution of the random variable taking values in $\{0,1\}^n \times \{0,1\}^n$ given by the distribution of $Z$, conditioned on $t$, restricted to $I$. The type of the
block $I$ is defined to be the vector containing an alternating sequence of integers that are the lengths of the filler and marker intervals in $I$, and the length of the type is simply the sum of the integers in the type, which gives the length of the block. The distribution of this random variable is determined by the parameter $p$ and the type of $I$. There are a countable number of types. Fix an enumeration $(\mathrm{type}_i)_{i \in \mathbb{N}}$ of the types and let $\rho_i$ be the corresponding law. Associate to each type-$i$ block a large integer $n^i_{\mathrm{initial}}$ which will be chosen later; here $i$ is an index that is not an exponent. A census of a city is the sequence of nonnegative integers $c = (c_i)$, where each $c_i$ is the number of type-$i$ blocks in the city.
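As a small illustration of the definitions of type and census, the following sketch computes the length of a type and the census of a city. It assumes, purely for illustration, that a type is stored as a tuple of interval lengths and that only a finite prefix of the enumeration is needed; the names `type_length`, `census`, `enumeration` and `city` are hypothetical and do not appear in the paper.

```python
from collections import Counter

def type_length(block_type):
    """Length of a type: the sum of its filler and marker interval lengths."""
    return sum(block_type)

def census(block_types, enumeration):
    """Census of a city.

    block_types: the types of the blocks in the city, from left to right.
    enumeration: a finite prefix of a fixed enumeration of the types
                 (an assumption; the true enumeration is countably infinite).
    Returns the list (c_0, c_1, ...) where c_i counts the type-i blocks.
    """
    counts = Counter(block_types)
    return [counts[t] for t in enumeration]

# Hypothetical data: three types and a city made of four blocks.
enumeration = [(3, 2, 4), (5,), (1, 2, 1)]
city = [(5,), (3, 2, 4), (5,), (5,)]
print(census(city, enumeration))
```

Only the counts enter the census, so the left-to-right order of the blocks within the city plays no role at this stage.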

Remark 14. Note that for every $i \in \mathbb{N}$, if the length of the type-$i$ is $n$, then $\rho_i$ is a probability measure on $\{0,1\}^n \times \{0,1\}^n$ with projections $\alpha_i$ and $\beta_i$ that have equal entropy. The equal entropy assertion also follows from the duality between $p$ and $1 - p$. ◊

A modification of $Z = (X, Y)$ on a subset of $\mathbb{Z}$ is a coupling of $X$ and $Y$ given by $Z' = (X', Y')$ such that $Z'$ is equal to $Z$ off the subset and has the same primary markers as $Z$. We will define a modification $Z'$ of $Z$ so that the law of $Z'$ will be a member of $J$. The modifications will be made independently on each city, so that we need only define what changes occur on a city. On each city the modifications will be made independently on each set of types, so that we need only define what changes occur on each set of types.

Suppose that the primary markers of $Z$ are given by $t \in T$. Fix $i \in \mathbb{N}$. Let us focus on the type-$i$ blocks in a single fixed city. We will refer to this modification as the star-modification of type-$i$ on a city. Suppose that the census $c$ is such that we may write

$$c_i = n^i_{\mathrm{initial}} + q_i k_{\mathrm{group}} + r_i, \qquad (21)$$

where $0 \le r_i < k_{\mathrm{group}}$ and $q_i$ is a nonnegative integer. We will not make modifications on the last $r_i$ blocks. Suppose that the length of the type-$i$ block is $n$. It may be helpful to think of two different copies of $\{0,1\}^n$ by setting $A = B = \{0,1\}^n$. Let $W = (W_j)$ be the set of random variables taking values in $A \times B$ obtained by taking the restriction of $Z$, conditioned to have primary markers given by $t$, to each block of type-$i$ in the city. Although $W$ gives an identically distributed sequence, where each $W_j$ has law $\rho_i$, it may not be independent. However, the projections on $A$ and $B$ are independent; if we write $W_j = (X_j, Y_j)$, then by Remark 5 we have that $X = (X_j)$ and $Y = (Y_j)$ are i.i.d. sequences. Consider the first $n^i_{\mathrm{initial}}$ random variables together as a single random variable taking values in $[A \times B]^{n^i_{\mathrm{initial}}}$, and each subsequent $k_{\mathrm{group}}$ random variables together as random variables taking values in $[A \times B]^{k_{\mathrm{group}}}$, to obtain a sequence of random variables $M = (M_0, M_1, \ldots, M_{q_i})$. Thus $M$ takes values on

$$(A \times B)^{n^i_{\mathrm{initial}} + q_i k_{\mathrm{group}}} = A^{n^i_{\mathrm{initial}} + q_i k_{\mathrm{group}}} \times B^{n^i_{\mathrm{initial}} + q_i k_{\mathrm{group}}}.$$

Take the iterative star-coupling with replacement of these random variables to obtain new random variables $M' = (M'_0, M'_1, \ldots, M'_{q_i})$; furthermore, using independent randomization, we may stipulate that these random variables are independent of $Z$. We define a modification $Z'$ of $Z$ by replacing the values of $M$ with those of $M'$, so that $Z = Z'$ off the type-$i$ blocks in the city, and if $Z$ restricted to the type-$i$ blocks is given by $M$, then $Z'$ restricted to the type-$i$ blocks is given by $M'$. The iterated star-coupling with replacement gives that the law of $M'$ projected onto each of the $n^i_{\mathrm{initial}} + q_i k_{\mathrm{group}}$ copies of $A \times B$ is $\rho_i$, so that monotonicity is preserved and the primary markers remain unchanged. Also, the projections of $M$ and $M'$ on $A^{n^i_{\mathrm{initial}} + q_i k_{\mathrm{group}}}$ have the same law. Similarly, the projections of $M$ and $M'$ on $B^{n^i_{\mathrm{initial}} + q_i k_{\mathrm{group}}}$ have the same law, so that by Remark 5 the random variable $Z'$ gives the required coupling.

Note we have only defined the star modification of type-$i$ when $c_i \ge n^i_{\mathrm{initial}} + k_{\mathrm{group}}$. In the case that $c_i$ is not sufficiently large, we simply do nothing; that is, we stipulate that the star modification of type-$i$ leaves everything unchanged.
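The bookkeeping behind the decomposition (21) can be sketched as follows. This is only an illustration of how the type-$i$ blocks of a city are split into an initial group, $q_i$ groups of size $k_{\mathrm{group}}$, and $r_i$ untouched blocks; it does not implement the star-coupling itself, and the function names `decompose_census` and `group_blocks` are hypothetical.

```python
def decompose_census(c_i, n_initial, k_group):
    """Return (q_i, r_i) with c_i = n_initial + q_i * k_group + r_i and
    0 <= r_i < k_group, or None when c_i < n_initial + k_group
    (the case in which the star modification does nothing)."""
    if c_i < n_initial + k_group:
        return None
    q_i, r_i = divmod(c_i - n_initial, k_group)
    return q_i, r_i

def group_blocks(blocks, n_initial, k_group):
    """Split the type-i blocks of a city, listed from left to right, into
    the groups underlying M_0, M_1, ..., M_{q_i}, plus the last r_i blocks,
    which are left unmodified."""
    decomposition = decompose_census(len(blocks), n_initial, k_group)
    if decomposition is None:
        return [], list(blocks)  # too few blocks: nothing is modified
    q_i, _r_i = decomposition
    groups = [list(blocks[:n_initial])]  # the initial group, for M_0
    for g in range(q_i):
        start = n_initial + g * k_group
        groups.append(list(blocks[start:start + k_group]))
    untouched = list(blocks[n_initial + q_i * k_group:])
    return groups, untouched

# Hypothetical example: 19 blocks, n_initial = 5, k_group = 4.
groups, untouched = group_blocks(list(range(19)), 5, 4)
print(len(groups), untouched)
```

In this example $19 = 5 + 3 \cdot 4 + 2$, so there are four groups (one initial group and $q_i = 3$ further groups) and the last two blocks are left untouched.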

For a single type-$i$, if we apply the star modification of the type on each city, independently, then we obtain a modification of $Z$ that has law belonging to $J$. We call this the star-modification of type-$i$ of $Z$. We summarize our construction in the following proposition.

Proposition 15. Let $p \in (1/2, 1)$ and $\zeta \in J(p)$. If $Z$ has law $\zeta$ and if $Z'$ is a star modification of $Z$ of a particular type, then the law of $Z'$ is also a member of $J(p)$.

Given a finite set of types, the star-modification of $Z$ on the set of types is obtained by applying the star-modification in succession, starting with the smallest type.

Remark 16. In our construction of the star modification of type-$i$ on a city, we relied on the fact that the law of $M'_j$ still has a projection of $\rho_i$ on each copy of $A \times B$ to ensure that primary markers and monotonicity are preserved. This fact will also be important for us later in proving that the parameters of the star-modification $Z'$ of $Z$ can be chosen so that it is a small perturbation in the $d^*$-metric, since on the event that the origin is contained in a block, and the coordinates of a cylinder set $C$ lie in that block, we have that the probabilities of $C$ under $Z$ and $Z'$ are not only close, they are equal! This is another one of the nice features of del Junco's star coupling. ◊

6.2. Choosing the parameters. From the discussion in Section 3.3, it remains to show that given a joining $\zeta \in J(p)$, we can choose parameters so that the star-modification of a random variable $Z$ with law $\zeta$ on a finite set of types results in a random variable with law that is close to $\zeta$ in the metric defined by (2) and is also an almost factor.

Proof of Theorem 1. Let $p \in (1/2, 1)$. Let $\varepsilon > 0$. As discussed in Section 3.3, it suffices to show that $U_\varepsilon$, the set of $\varepsilon$-almost factors from $B(1-p, p)$ to $B(p, 1-p)$, is dense. The proof that the corresponding set of $\varepsilon$-almost factors from $B(p, 1-p)$ to $B(1-p, p)$ is dense is similar, with the roles of $p$ and $1-p$ reversed.

Let $\zeta \in J(p)$ and $Z$ have law $\zeta$. Let $\varepsilon > 0$. We will choose the parameters for the star-modification $Z'$ of $Z$ as follows. The modification will occur on a finite set of types $T$, which will be specified later. Recall that in the star-modification, some blocks are left unchanged, so that $Z$ equals $Z'$ on those blocks, whereas some blocks are modified via the iterated star-coupling with replacement, so that $Z$ may not be equal to $Z'$ on those blocks. Note that $Z'$ and $Z$ always share the same primary markers, and although markers may lie in the modified coordinates they are always preserved. If $\zeta'$ is the law of $Z'$, then these parameters will be chosen so that $d^*(\zeta, \zeta') < \varepsilon$ and $\zeta' \in U_\varepsilon$. We choose the parameters as follows.

(i) Set $\varepsilon' := \varepsilon/100$.

(ii) Let $\delta > 0$ be small enough and $\ell^*$ be large enough so that two measures $\zeta$ and $\zeta'$ on $\{0,1\}^{\mathbb{Z}} \times \{0,1\}^{\mathbb{Z}}$ are $\varepsilon'$ close in the metric $d^*$, if for all cylinder sets $C \in \mathcal{C}_{\ell^*}$, we have $|\zeta(C) - \zeta'(C)| < \delta$.

(iii) Choose $k_{\mathrm{mark}}$ sufficiently large so that with probability at least $1 - \varepsilon'$ the origin is in a block and the interval $[-2\ell^*, 2\ell^*]$ is in the block.

(iv) With this choice of $k_{\mathrm{mark}}$, there exists $L > 0$ such that with probability at least $1 - 2\varepsilon'$ the origin will be in a block and the length of the block will be between $\ell^*$ and $L$.

(v) In particular, there exists a finite set of types $T$, those with lengths between $\ell^*$ and $L$, such that with probability at least $1 - 2\varepsilon'$, the block containing the origin will have a type in $T$. Since we have a fixed enumeration of the types, we will view $T$ as a subset of $\mathbb{N}$.

(vi) Set $k_{\mathrm{group}} = \lceil 1/\varepsilon' \rceil + 1$.

(vii) For each $i \in \mathbb{N}$, choose $n^i_{\mathrm{initial}}$ via Proposition 13 by substituting $\rho = \rho_i$, $n_{\mathrm{initial}} = n^i_{\mathrm{initial}}$, and $\eta = \varepsilon'$.

(viii) Let $c$ be the census of the city containing the origin. If $c_i$ is sufficiently large, define $q_i$ as in (21). Choose $r_{\mathrm{mark}}$ sufficiently large so that with probability at least $1 - \varepsilon'$, the origin is in a city, and for all $i \in T$ the census will satisfy

Applying Proposition 15 a finite number of times gives that $\zeta' \in J$. By Remark 16, conditions (i), (ii), and (iii) imply that $d^*(\zeta, \zeta') < \varepsilon$.

It remains to verify that $\zeta' \in U_\varepsilon$. Call $t \in T$ a model marker if the block containing the origin is a modified block and the origin lies in a destined coordinate. Property (vii) and Proposition 13 imply that for all model markers $t \in T$ there exists a deterministic measurable $\psi : \{0,1\}^{\mathbb{Z}} \to \{0,1\}$ such that

$$\zeta'_t\{(x, y) : (x, y_0) = (x, \psi(x))\} > 1 - \varepsilon'. \qquad (22)$$

For a particular type-$i$, with $c_i = n^i_{\mathrm{initial}} + q_i k_{\mathrm{group}} + r_i$ as in (21), the ratio of undetermined coordinates plus those that are unchanged to destined coordinates is $(n^i_{\mathrm{initial}} + q_i + r_i)/(q_i(k_{\mathrm{group}} - 1))$. Recall that $r_i < k_{\mathrm{group}}$.

Conditions (iii), (v), (vi), and (viii) ensure that the set of model markers has probability at least $1 - 7\varepsilon'$; this fact together with (22) and (i) implies that $\zeta' \in U_\varepsilon$. □
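The role of the choice $k_{\mathrm{group}} = \lceil 1/\varepsilon' \rceil + 1$ in the ratio above can be checked numerically. The sketch below evaluates the ratio $(n^i_{\mathrm{initial}} + q_i + r_i)/(q_i(k_{\mathrm{group}} - 1))$ with exact rational arithmetic; the specific values of $n^i_{\mathrm{initial}}$, $q_i$ and $r_i$ are made up for illustration, and the function names are hypothetical.

```python
from fractions import Fraction
from math import ceil

def k_group_for(eps_prime):
    """k_group = ceil(1 / eps') + 1, as in item (vi)."""
    return ceil(1 / eps_prime) + 1

def non_destined_ratio(n_initial, q, r, k_group):
    """Ratio of (undetermined + unchanged) coordinates to destined ones,
    counted in blocks: (n_initial + q + r) / (q * (k_group - 1))."""
    return Fraction(n_initial + q + r, q * (k_group - 1))

eps_prime = Fraction(1, 100)        # illustrative value of eps'
k_group = k_group_for(eps_prime)    # ceil(100) + 1
small_q = non_destined_ratio(50, 1000, 7, k_group)
large_q = non_destined_ratio(50, 10**6, 7, k_group)
print(k_group, float(small_q), float(large_q))
```

Since the ratio equals $1/(k_{\mathrm{group}} - 1) + (n^i_{\mathrm{initial}} + r_i)/(q_i(k_{\mathrm{group}} - 1))$ and $1/(k_{\mathrm{group}} - 1) \le \varepsilon'$, it approaches a value of at most $\varepsilon'$ as $q_i$ grows, which is why the census is required to be large.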


7. Some other examples 

One of the key observations of Keane and Smorodinsky (see their Lemmas 2 and 3) that allowed the definition of markers in their proof that two Bernoulli shifts $B(p)$ and $B(q)$ of equal entropy are isomorphic was that one could assume without loss of generality that $p_0 = q_0$ in the case where $p$ and $q$ give non-zero mass to three or more symbols, and in the case where $p$ gives non-zero mass to only two symbols, one can assume that $p_0^k p_1 = q_0^k q_1$ for some $k$. In general, in the construction of monotone factors, we may not make this reduction since monotonicity may not be preserved. However, by a straightforward adaptation of the proof of Theorem 1, the following monotone versions of the Keane and Smorodinsky reductions are enough to prove the existence of a monotone isomorphism.

Theorem 17. Let $N > 2$. Let $p$ and $q$ be probability measures on $[N]$ of equal entropy. Suppose $p$ stochastically dominates $q$, and furthermore there exists $i > j$ such that $p_i = q_j$ and $p^*$ stochastically dominates $q^*$, where $p^*$ is the law of a random variable with law $p$ conditioned not to take the value $i$, and $q^*$ is the law of a random variable with law $q$ conditioned not to take the value $j$. Then there exists a monotone isomorphism of $B(p)$ and $B(q)$.
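The two domination hypotheses of this kind can be checked mechanically for given measures. The sketch below uses the standard tail-sum characterization of stochastic domination on $[N]$ and exact rational arithmetic; the function names are hypothetical, and the binary pair used as input is the one from Theorem 1 with the illustrative value $p = 7/10$. The code does not check the equal entropy assumption.

```python
from fractions import Fraction

def dominates(p, q):
    """True if p stochastically dominates q on {0, ..., N-1}, that is,
    P_p(X >= k) >= P_q(X >= k) for every k."""
    assert len(p) == len(q)
    tail_p = tail_q = 0
    for k in reversed(range(len(p))):
        tail_p += p[k]
        tail_q += q[k]
        if tail_p < tail_q:
            return False
    return True

def conditioned_away_from(p, i):
    """Law of a random variable with law p conditioned not to take the
    value i (assumes p[i] < 1)."""
    mass = sum(p[k] for k in range(len(p)) if k != i)
    return [Fraction(0) if k == i else p[k] / mass for k in range(len(p))]

# The pair (1 - p, p) and (p, 1 - p) with p = 7/10.
p = [Fraction(3, 10), Fraction(7, 10)]
q = [Fraction(7, 10), Fraction(3, 10)]
p_star = conditioned_away_from(p, 1)  # i = 1
q_star = conditioned_away_from(q, 0)  # j = 0
print(dominates(p, q), dominates(p_star, q_star))
```

For this binary pair the first check succeeds, while the conditioned laws are point masses at 0 and at 1 respectively, so the second check fails for $i = 1$, $j = 0$.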

Theorem 18. Let $N > 2$. Let $p$ and $q$ be probability measures on $[N]$ of equal entropy. Suppose $p$ stochastically dominates $q$, and furthermore there exist $i > j$ and $k > l$ such that $p_i p_k = q_j q_l$ and for all $n \ge 1$, we have that $p^{n*}$ stochastically dominates $q^{n*}$, where $p^{n*}$ is the law of a random vector with law $p^n$ conditioned so that an occurrence of an $i$ is never immediately followed by an occurrence of a $k$, and $q^{n*}$ is the law of a random vector with law $q^n$ conditioned so that an occurrence of a $j$ is never followed by an occurrence of an $l$. Then there exists a monotone isomorphism of $B(p)$ and $B(q)$.